Abstract. We study the space requirement of m-ary search trees under the random
permutation model when $m \ge 27$ is fixed. Chauvin and Pouyanne have shown
recently that $X_n$, the space requirement of an m-ary search tree on n keys, equals
$\mu(n + 1) + 2\,\mathrm{Re}[\Lambda n^{\lambda_2}] + \epsilon_n n^{\mathrm{Re}\,\lambda_2}$, where $\mu$ and $\lambda_2$ are certain constants, $\Lambda$ is a
complex-valued random variable, and $\epsilon_n \to 0$ a.s. and in $L^2$ as $n \to \infty$. Using the
contraction method, we identify the distribution of $\Lambda$.

Keywords. m-ary search trees, space requirement, limiting distributions, contraction
method.

1 Introduction 

We start by giving a brief overview of search trees, which are fundamental data
structures in computer science used in searching and sorting. For integer $m \ge 2$,
the m-ary search tree, or multiway tree, generalizes the binary search tree. The
quantity m is called the branching factor. According to [10], search trees of
branching factors higher than 2 were first suggested by Muntz and Uzgalis [12] "to solve
internal memory problems with large quantities of data." For more background we
refer the reader to [7, 8] and [10].

An m-ary tree is a rooted tree with at most m "children" for each node (vertex), 
each child of a node being distinguished as one of m possible types. Recursively 
expressed, an m-ary tree either is empty or consists of a distinguished node (called 
the root) together with an ordered m-tuple of subtrees, each of which is an m-ary 
tree. 

An m-ary search tree is an m-ary tree in which each node has the capacity to 
contain $m - 1$ elements of some linearly ordered set, called the set of keys. In typical
implementations of m-ary search trees, the keys at each node are stored in increasing 
order and at each node one has m pointers to the subtrees. By spreading the input 
data in m directions instead of only 2, as is the case for a binary search tree, one 
seeks to have shorter path lengths and thus quicker searches. 

We consider the space of m-ary search trees on n keys, and assume that the keys 
are linearly ordered. Hence, without loss of generality, we can take the set of keys 
to be [n] := {1, 2, . . . , n}. We construct an m-ary search tree from a sequence s of 
n distinct keys in the following way: 

(i) If n < m, then all the keys are stored in the root node in increasing order. 
(ii) If $n \ge m$, then the first $m - 1$ keys in the sequence are stored in the root in
increasing order, and the remaining $n - (m - 1)$ keys are stored in the subtrees
subject to the condition that if $\sigma_1 < \sigma_2 < \cdots < \sigma_{m-1}$ denotes the ordered
sequence of keys in the root, then the keys in the jth subtree are those that
lie between $\sigma_{j-1}$ and $\sigma_j$, where $\sigma_0 := 0$ and $\sigma_m := n + 1$, sequenced as in s.

(iii) All the subtrees are m-ary search trees that satisfy conditions (i), (ii), and (iii). 

For example, the m-ary search tree constructed from the sequence

(10, 7, 12, 4, 1, 8, 5, 6, 9, 14, 11, 2, 15, 13, 3) 

is shown in Figure 1. Note that empty nodes (also called external nodes) are
represented as circles in the figure; m such nodes arise as children of a given node
when that node becomes filled to its capacity of $m - 1$ keys. In this paper the total
number of nodes (empty and nonempty) in an m-ary search tree is called the space
requirement of the tree.
Fig. 1. An m-ary search tree with space requirement 13.
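The construction rules (i)-(iii) and the definition of the space requirement can be sketched in a few lines of Python. The function name `space_requirement` is ours, and the branching factor m = 4 used in the example call is an assumption: the text does not state which m the figure uses, but under this assumption the count agrees with the value 13 given in the caption.

```python
from bisect import bisect_left


def space_requirement(seq, m):
    """Count all nodes (empty and nonempty) of the m-ary search tree built from seq."""
    if len(seq) < m - 1:
        # The node is not full, so it has no children: a single node
        # (for an empty sequence this is one empty, external node).
        return 1
    # The first m - 1 keys fill the root; sorting gives sigma_1 < ... < sigma_{m-1}.
    root = sorted(seq[: m - 1])
    # Route each remaining key to the subtree between consecutive root keys,
    # preserving the order in which the keys appear in seq.
    buckets = [[] for _ in range(m)]
    for key in seq[m - 1:]:
        buckets[bisect_left(root, key)].append(key)
    # A full node always has m children, some of which may be empty.
    return 1 + sum(space_requirement(b, m) for b in buckets)


example = (10, 7, 12, 4, 1, 8, 5, 6, 9, 14, 11, 2, 15, 13, 3)
print(space_requirement(example, 4))  # 13 (with the assumed m = 4)
```

For m = 2 the same function returns 2n + 1 on any sequence of n distinct keys, the familiar count of internal plus external nodes of a binary search tree.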



The uniform distribution on the space of permutations of [n] induces a
distribution on the space of m-ary search trees with n keys. This is known as the random
permutation model.

Several authors have studied the limiting distribution of the space requirement
under the random permutation model. Mahmoud and Pittel [11] showed that when
$m \le 15$, the limiting distribution is normal. The result was later extended to include
$m \le 26$ by Lew and Mahmoud [9]. Chern and Hwang [3] proved that when $m \ge 27$,
the space requirement centered by its mean and scaled by its standard deviation
does not have a limiting distribution. Our result, stated as Theorem 1, for the case
$m \ge 27$ was inspired by a recent development (stated at the beginning of Section 2)
of Chauvin and Pouyanne [2].

2 Summary 

Let $X_n$ denote the space requirement of an m-ary search tree on n keys chosen
under the random permutation model. Recently, Chauvin and Pouyanne [2] have
used martingale techniques to show that when $m \ge 27$, we have $X_n = \widetilde{X}_n + n^{\sigma}\epsilon_n$,
where

$$\widetilde{X}_n := \frac{1}{H_m - 1}(n + 1) + 2\,\mathrm{Re}[n^{\lambda_2}\Lambda], \qquad (1)$$

with $\Lambda$ some complex-valued random variable and $\epsilon_n \to 0$ a.s. and in $L^2$. [In fact,
they derive the asymptotics of the random vector $(S_n^{(0)}, \ldots, S_n^{(m-1)})$, where $S_n^{(i)}$
denotes the number of nodes with i keys in a tree with n keys, but we shall be
content here to study $X_n = \sum_{i=0}^{m-1} S_n^{(i)}$.] In this representation, $\lambda_2 = \sigma + i\tau$ is the
root of the polynomial

$$\phi(z) \equiv \phi_m(z) := (z + 1)\cdots(z + m - 1) - m! \qquad (2)$$

having second-largest real part and positive imaginary part. It is our goal to describe
the distribution of the random variable $\Lambda$.
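As a numerical aside, the root $\lambda_2$ of (2) can be approximated with Newton's method. The sketch below rests on an assumption not spelled out in the text: that $\lambda_2$ is the root for which the arguments of the factors $(z + k)/(k + 1)$ sum to $2\pi$, so that it solves $\sum_k \log((z + k)/(k + 1)) = 2\pi i$. The function names and the starting guess are ours, and the routine is only meant for values of m (such as 26 or 27) for which such a complex root exists.

```python
import cmath
import math


def lambda_2(m, iterations=50):
    """Approximate the root of phi_m whose factor arguments sum to 2*pi."""
    # h(z) = sum_k log((z + k) / (k + 1)) - 2*pi*i vanishes exactly when
    # (z + 1)...(z + m - 1) = m! with the factor arguments summing to 2*pi.
    harmonic_minus_1 = sum(1.0 / (k + 1) for k in range(1, m))
    z = 1 + 2j * math.pi / harmonic_minus_1  # first-order guess near lambda_1 = 1
    for _ in range(iterations):
        h = sum(cmath.log((z + k) / (k + 1)) for k in range(1, m)) - 2j * math.pi
        dh = sum(1 / (z + k) for k in range(1, m))
        z -= h / dh  # Newton step
    return z


def phi_ratio(z, m):
    """phi_m(z) / m!, computed as a product of ratios to avoid huge numbers."""
    product = 1 + 0j
    for k in range(1, m):
        product *= (z + k) / (k + 1)
    return product - 1


root = lambda_2(27)
print(root, abs(phi_ratio(root, 27)))
```

The residual `phi_ratio(root, 27)` gives a direct check that the returned value is a root of (2); the real part of `root` plays the role of $\sigma$ and its imaginary part the role of $\tau$.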

To begin, we define the following distributional transform T on $\mathcal{M}_2(\mu)$, the space
of probability distributions with a certain mean $\mu$ defined at (7) and finite second
absolute moment:

$$T: \mathcal{M}_2(\mu) \to \mathcal{M}_2(\mu), \qquad \mathcal{L}(W) \mapsto \mathcal{L}\Big(\sum_{k=1}^{m} S_k^{\lambda_2} W_k\Big), \qquad (3)$$

where $(W_k)_{k=1}^{m}$ are independent copies of W. Here $S = (S_1, \ldots, S_m)$ is the vector of
spacings of $m - 1$ independent Uniform(0, 1) random variables $U_1, \ldots, U_{m-1}$; i.e., if
$U_{(1)}, \ldots, U_{(m-1)}$ are their order statistics and $U_{(0)} := 0$, $U_{(m)} := 1$, then

$$S_j := U_{(j)} - U_{(j-1)}, \qquad j = 1, \ldots, m. \qquad (4)$$
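The spacings in (4) and a single draw from the transform (3) are easy to simulate. The following is a minimal sketch, not part of the original argument: the law of W is replaced by the empirical law of a finite pool of samples, the helper names are ours, and the exponent `lam` in the demo is a placeholder rather than a computed value of $\lambda_2$.

```python
import random


def spacings(m, rng):
    """Spacings S_1, ..., S_m of m - 1 independent Uniform(0, 1) points, as in (4)."""
    points = sorted(rng.random() for _ in range(m - 1))
    edges = [0.0] + points + [1.0]
    return [edges[j] - edges[j - 1] for j in range(1, m + 1)]


def complex_power(x, lam):
    """x ** lam for real x >= 0 and Re(lam) > 0, with 0 ** lam taken as 0."""
    return 0j if x == 0 else x ** lam


def apply_T(pool, lam, m, rng):
    """One draw from T applied to the empirical law of pool: sum_k S_k^lam * W_k."""
    s = spacings(m, rng)
    # The W_k are drawn independently (with replacement) from the pool,
    # which stands in for independent copies of W.
    return sum(complex_power(s_k, lam) * rng.choice(pool) for s_k in s)


rng = random.Random(0)
pool = [1 + 0j] * 1000            # start from a point mass (illustrative choice)
lam = 0.52 + 2.18j                # placeholder exponent, NOT a computed lambda_2
for _ in range(5):                # a few rounds of resampling through T
    pool = [apply_T(pool, lam, 27, rng) for _ in range(len(pool))]
```

Iterating the map on a pool in this way is a common heuristic for visualizing a fixed point of a distributional transform; it is offered only as an illustration and plays no role in the proofs.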

Furthermore, we take S to be independent of $(W_k)_{k=1}^{m}$. Next, define the metric $d_2$
on $\mathcal{M}_2(\mu)$ by

$$d_2(F, G) := \min\{\|X - Y\|_2 : \mathcal{L}(X) = F,\ \mathcal{L}(Y) = G\},$$

with $\|X\|_2 := (\mathbf{E}|X|^2)^{1/2}$ denoting the $L^2$-norm. In the sequel, for notational
convenience we will write $d_2(X, Y)$ instead of $d_2(\mathcal{L}(X), \mathcal{L}(Y))$.
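For intuition about $d_2$, here is a small sketch restricted to real-valued random variables, where the minimizing coupling is known to pair quantiles. It computes the distance between the empirical laws of two samples of equal size; the function name is ours, and the complex-valued case treated in the paper is not covered by this shortcut.

```python
import math


def d2_empirical(xs, ys):
    """d_2 distance between the empirical laws of two equal-size real samples."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    # On the real line the minimizing coupling pairs order statistics,
    # so sorting both samples realizes the minimum in the definition of d_2.
    paired = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in paired) / len(xs))


print(d2_empirical([0, 1], [1, 0]))  # 0.0: the two samples have the same law
```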

Our main result is the following. (See the remark below Lemma 7 for a
strengthening.)

Theorem 1. Let $X_n$ denote the space requirement of an m-ary search tree on n keys
under the random permutation model with $m \ge 27$. Define

$$V_n := X_n - \frac{1}{H_m - 1}(n + 1)$$

and $\widetilde{V}_n := 2\,\mathrm{Re}[n^{\lambda_2} Y]$. Here Y is a random variable with distribution equal to the
unique fixed point $\mathcal{L}(Y)$ of the distributional transform (3). Then $d_2(V_n, \widetilde{V}_n) = o(n^{\sigma})$
and consequently $\Lambda$ has the same distribution as Y.

The proof of Theorem 1 is presented in Section 3, with the existence of the unique
fixed point established in Section 3.1 and bounds on the $d_2$-distance derived in
Section 3.2.

Remark. As discussed in [2] and [6], the study of the random vector $(S_n^{(0)}, \ldots, S_n^{(m-1)})$
can be recast as a generalized Pólya urn scheme which in turn can be studied by
embedding into a continuous-time Markov multitype branching process. Janson [6]
obtains asymptotic distributional results for a very general class of urn schemes
and multitype branching processes. These include results for m-ary search trees,
with (1) as a notable example. We anticipate that our contraction-method
technique for identifying $\mathcal{L}(\Lambda)$ in (1) will extend quite generally to oscillatory cases of
Janson's results; this is the subject of ongoing research. □
In the sequel we will use $1 =: \lambda_1, \lambda_2, \ldots, \lambda_{m-1}$ to denote the $m - 1$ roots of (2)
in nonincreasing order of real parts and roots with positive imaginary parts listed
before their conjugates. In [10, §3.3] and [5], the polynomial $\psi(\lambda) = \phi(\lambda - 1)$ is
considered. The properties of the roots of $\phi$ that we employ follow immediately
from those known for the roots of $\psi$.

3 Proofs 

As preliminaries, note that the space requirement $X_n$ has initial conditions
$X_0 = X_1 = \cdots = X_{m-2} = 1$, and for $n \ge m - 1$ that the number of keys not stored in
the root is

$$n' := n - (m - 1).$$

It is well known that, under the random permutation model, $X_n$ satisfies the
distributional recurrence

$$X_n \stackrel{\mathcal{L}}{=} \sum_{k=1}^{m} X^{(k)}_{J_k} + 1, \qquad n \ge m - 1, \qquad (5)$$

where $\stackrel{\mathcal{L}}{=}$ denotes equality in law (i.e., in distribution), and where, on the right,

- the random vector $J = (J_1, \ldots, J_m)$ is uniformly distributed over all m-tuples
$(j_1, \ldots, j_m)$ of nonnegative integers with $j_1 + \cdots + j_m = n'$;

- for each $k = 1, \ldots, m$, we have $X_j^{(k)} \stackrel{\mathcal{L}}{=} X_j$;

- the quantities $J$; $X_0^{(1)}, \ldots, X_{n'}^{(1)}$; $X_0^{(2)}, \ldots, X_{n'}^{(2)}$; $\ldots$; $X_0^{(m)}, \ldots, X_{n'}^{(m)}$ are all
independent.
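Taking expectations in (5) gives an ordinary recurrence for $\mathbf{E} X_n$ that can be evaluated exactly. The sketch below does this with rational arithmetic; the function name is ours, and it uses the marginal law $P[J_1 = j] = \binom{n-1-j}{m-2} / \binom{n}{m-1}$, which follows from counting the m-tuples in the description of J.

```python
from fractions import Fraction
from math import comb


def mean_space(n_max, m):
    """Exact E[X_n] for n = 0..n_max, obtained by taking expectations in (5)."""
    mean = [Fraction(1)] * min(m - 1, n_max + 1)    # X_0 = ... = X_{m-2} = 1
    for n in range(m - 1, n_max + 1):
        n_rest = n - (m - 1)                         # n' keys go to the subtrees
        # P[J_1 = j] = C(n-1-j, m-2) / C(n, m-1) for J uniform on m-tuples summing to n'.
        total = sum(comb(n - 1 - j, m - 2) * mean[j] for j in range(n_rest + 1))
        # By symmetry the m subtrees contribute equally; the "+ 1" is the root.
        mean.append(1 + Fraction(m, comb(n, m - 1)) * total)
    return mean


print(mean_space(5, 2))  # for m = 2 the values are 2n + 1
```

For m = 2 the output reproduces the deterministic value 2n + 1, which is a convenient sanity check on the recurrence.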

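Using (5), we get a distributional recurrence for $V_n$, with notation as for the X's:

$$V_n \stackrel{\mathcal{L}}{=} \sum_{k=1}^{m} V^{(k)}_{J_k}, \qquad n \ge m - 1. \qquad (6)$$

The initial conditions here are $V_j = 1 - \frac{j + 1}{H_m - 1}$ for $j = 0, 1, \ldots, m - 2$. The asymptotics
of the mean of $V_n$ can be derived using [5, Equation (2.7)]:

$$\mathbf{E} V_n = \mu n^{\lambda_2} + \bar{\mu} n^{\lambda_3} + O(n^{\mathrm{Re}\,\lambda_4}), \qquad (7)$$

where $\mu$ is a constant. Note that no two roots of (2) have the same real part unless
they are mutually conjugate, so that $\mathrm{Re}\,\lambda_4 < \mathrm{Re}\,\lambda_3 = \mathrm{Re}\,\lambda_2 = \sigma$.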

For the reader's convenience, we state here a part of the Asymptotic Transfer
Theorem of [5]. We will use this result in Section 3.2. The constant K' can be expressed
in terms of K, but we shall have no use here for such an expression.

Proposition 2. For fixed $m \ge 2$, consider the recurrence

$$a_n = b_n + \frac{m}{\binom{n}{m-1}} \sum_{j=0}^{n-(m-1)} \binom{n-1-j}{m-2} a_j, \qquad n \ge m - 1,$$

with specified initial conditions $(a_j)_{j=0}^{m-2}$. If $b_n = K n^{v} + o(n^{v})$ with $v > 1$ and K a
constant, then

$$a_n = K' n^{v} + o(n^{v})$$

where K' is a constant.
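The recurrence of Proposition 2 can be iterated numerically to watch $a_n / n^{v}$ settle down. In the sketch below the function names are ours, and the formula in `heuristic_constant` is an assumption not taken from the text: it is the value one gets by substituting $a_n = K' n^{v}$ into the recurrence and replacing the sum by a Beta integral.

```python
from math import comb, gamma


def solve_recurrence(b, initial, m, n_max):
    """Iterate a_n = b(n) + m / C(n, m-1) * sum_j C(n-1-j, m-2) * a_j for n >= m - 1."""
    a = list(initial)                      # a_0, ..., a_{m-2}
    for n in range(m - 1, n_max + 1):
        total = sum(comb(n - 1 - j, m - 2) * a[j] for j in range(n - m + 2))
        a.append(b(n) + m * total / comb(n, m - 1))
    return a


def heuristic_constant(K, v, m):
    """Guess for K' from substituting a_n = K' n^v (an assumption, see the lead-in)."""
    return K / (1 - gamma(m + 1) * gamma(v + 1) / gamma(v + m))


a = solve_recurrence(lambda n: float(n) ** 2, [0.0], m=2, n_max=300)
print(a[300] / 300 ** 2, heuristic_constant(1.0, 2.0, 2))
```

With m = 2, $b_n = n^2$ and $a_0 = 0$ the recurrence can also be solved in closed form, $a_n = 3n^2 + 10n - 6(n + 1)H_n$, so the ratio approaches 3 only slowly because of the $n \log n$ correction.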
3.1 Fixed point

The existence and uniqueness of the fixed point of the map T at (3) follows from the
contraction method (see, e.g., [13]). Indeed a routine modification of the argument
presented in [5, §6] yields that T is a contraction on $\mathcal{M}_2(\mu)$ with contraction factor

$$\left[\frac{m!\,\Gamma(2\sigma + 1)}{\Gamma(2\sigma + m)}\right]^{1/2} = \left[\frac{m!}{(2\sigma + m - 1)\cdots(2\sigma + 1)}\right]^{1/2} < 1,$$

since for $m \ge 27$, we have $\sigma > 1/2$ [10, 5].
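The role of the threshold $\sigma > 1/2$ in the contraction factor is easy to see numerically. The sketch below evaluates the second form of the factor as a product of ratios, pairing each factor of $m!$ with one factor of the denominator; the function name and the sample values of $\sigma$ are ours.

```python
import math


def contraction_factor(m, sigma):
    """[m! / ((2*sigma + 1) ... (2*sigma + m - 1))] ** 0.5, via a product of ratios."""
    ratio = 1.0
    for k in range(1, m):
        # Pair the factor (k + 1) of m! with the factor (2*sigma + k) of the denominator.
        ratio *= (k + 1) / (2 * sigma + k)
    return math.sqrt(ratio)


for sigma in (0.4, 0.5, 0.6):
    print(sigma, contraction_factor(27, sigma))
```

At $\sigma = 1/2$ every ratio equals 1, so the factor is exactly 1; it drops below 1 as soon as $\sigma$ exceeds 1/2, which is the inequality used above.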



3.2 $d_2$ bounds

We begin by defining $d_n := d_2(V_n, \widetilde{V}_n)$ and $f(t) := 2\,\mathrm{Re}\,t = t + \bar{t}$. Unless otherwise
noted we will henceforth assume $n \ge m - 1$. Throughout $\sum_{\mathbf{j}}$ will denote a sum over
all m-tuples $\mathbf{j} = (j_1, \ldots, j_m)$ of nonnegative integers summing to $n'$.
By the triangle inequality,

$$d_n \le a_n + b_n, \qquad (8)$$

where, taking $(Y_k)_{k=1}^{m}$ to be independent copies of the random variable Y in
Theorem 1 and J and S each independent of $(Y_k)_{k=1}^{m}$,

$$a_n := d_2\Big(V_n,\ \sum_{k=1}^{m} f(J_k^{\lambda_2} Y_k)\Big) \qquad (9)$$

and

$$b_n := d_2\Big(\sum_{k=1}^{m} f(J_k^{\lambda_2} Y_k),\ \sum_{k=1}^{m} f(n^{\lambda_2} S_k^{\lambda_2} Y_k)\Big). \qquad (10)$$

We proceed by deriving upper bounds for $a_n$ and $b_n$ separately. The bound on $b_n$
is proved as Lemma 4.

For $a_n$ a crude bound can be derived as follows. Even though this bound is not
sufficient to show that $d_n = o(n^{\sigma})$, it will be employed in Lemma 7, which in turn
will be used to derive the estimate that we need.

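Lemma 3. With $a_n$ defined at (9),

$$a_n = O(n^{\sigma}).$$

Proof. By the triangle inequality,

$$a_n \le \|V_n\|_2 + \sum_{k=1}^{m} \|f(J_k^{\lambda_2} Y_k)\|_2 = \|V_n\|_2 + m\,\|f(J_1^{\lambda_2} Y_1)\|_2.$$

Since $J_1 \le n'$ and $\|Y_1\|_2 < \infty$, we have $\|f(J_1^{\lambda_2} Y_1)\|_2 = O(n^{\sigma})$. Using independence
of the $V^{(k)}$'s, (6), and (7), we have

$$\|V_n\|_2^2 = \sum_{\mathbf{j}} \mathbf{P}[J = \mathbf{j}]\ \mathbf{E}\Big|\sum_{k=1}^{m} V^{(k)}_{j_k}\Big|^2 = \frac{1}{\binom{n}{m-1}} \sum_{\mathbf{j}} \sum_{k=1}^{m} \|V_{j_k}\|_2^2 + O(n^{2\sigma}) = \frac{m}{\binom{n}{m-1}} \sum_{j=0}^{n-(m-1)} \binom{n-1-j}{m-2} \|V_j\|_2^2 + O(n^{2\sigma}).$$

It follows from Proposition 2 that $\|V_n\|_2^2 = O(n^{2\sigma})$, and the result follows. □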
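To sharpen Lemma 3, we employ the following coupling between the distributions
of $V_n$ and of $\sum_{k=1}^{m} f(J_k^{\lambda_2} Y_k)$. The $L^2$ distance exhibited by this coupling serves as
an upper bound on the $d_2$-distance. For $k = 1, \ldots, m$, let $(V_1^{(k)}, V_2^{(k)}, \ldots; Y_k)$ be
independent copies of $(V_1, V_2, \ldots; Y)$ such that the coupling between $V_j$ and Y is
$d_2$-optimal for each j (that is, $\|V_j - f(j^{\lambda_2} Y)\|_2 = d_j$). [To construct such a coupling,
first choose optimally-coupled $V_1$ and Y; having chosen $(V_1, \ldots, V_j; Y)$, choose $V_{j+1}$ so
that it is optimally-coupled with Y.] Then, with $J = (J_k)_{k=1}^{m}$ independent of everything else,

$$a_n^2 \le \Big\|\sum_{k=1}^{m} V^{(k)}_{J_k} - \sum_{k=1}^{m} f(J_k^{\lambda_2} Y_k)\Big\|_2^2 = \sum_{\mathbf{j}} \mathbf{P}[J = \mathbf{j}]\ \mathbf{E}\Big|\sum_{k=1}^{m} V^{(k)}_{j_k} - \sum_{k=1}^{m} f(j_k^{\lambda_2} Y_k)\Big|^2. \qquad (11)$$

Now

$$\begin{aligned}
\mathbf{E}\Big|\sum_{k=1}^{m} V^{(k)}_{j_k} - \sum_{k=1}^{m} f(j_k^{\lambda_2} Y_k)\Big|^2
&= \sum_{k=1}^{m} \big\|V^{(k)}_{j_k} - f(j_k^{\lambda_2} Y_k)\big\|_2^2 + \sum_{1 \le k \ne l \le m} \mathbf{E}\big\{[V^{(k)}_{j_k} - f(j_k^{\lambda_2} Y_k)]\,[V^{(l)}_{j_l} - f(j_l^{\lambda_2} Y_l)]\big\} \\
&= \sum_{k=1}^{m} d_{j_k}^2 + \sum_{1 \le k \ne l \le m} \mathbf{E}[V^{(k)}_{j_k} - f(j_k^{\lambda_2} Y_k)]\ \mathbf{E}[V^{(l)}_{j_l} - f(j_l^{\lambda_2} Y_l)].
\end{aligned} \qquad (12)$$

If we choose the mean $\mathbf{E} Y$ to be $\mu$, it follows from (7) that
$\mathbf{E}[V_n - f(n^{\lambda_2} Y)] = O(n^{\mathrm{Re}\,\lambda_4})$. It follows then that the second sum in (12) is
$O(n^{2\,\mathrm{Re}\,\lambda_4}) = o(n^{2\sigma})$ uniformly in $\mathbf{j}$. Thus, from (11) and (12),

$$a_n^2 \le \mathbf{E} \sum_{k=1}^{m} d_{J_k}^2 + r_n, \qquad (13)$$

where $r_n = o(n^{2\sigma})$.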
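Next, we proceed to bound $b_n$.

Lemma 4. With $b_n$ defined at (10),

$$b_n = o(n^{\sigma}).$$

Proof. We take $Y_1, \ldots, Y_m$ to be independent copies of Y and (J, S) independent
of $Y_1, \ldots, Y_m$. The conditional distribution of J given $S = \mathbf{s} = (s_1, \ldots, s_m)$ is taken
to be Multinomial$(n', \mathbf{s})$. Indeed this yields the distribution of the vector of sizes of
the subtrees rooted at the root of a random m-ary search tree [4]. Then

$$\begin{aligned}
b_n &\le \Big\|\sum_{k=1}^{m} f(J_k^{\lambda_2} Y_k) - \sum_{k=1}^{m} f(n^{\lambda_2} S_k^{\lambda_2} Y_k)\Big\|_2 \\
&\le \sum_{k=1}^{m} \big\|f(J_k^{\lambda_2} Y_k) - f(n^{\lambda_2} S_k^{\lambda_2} Y_k)\big\|_2 \\
&\le 2 \sum_{k=1}^{m} \big\|[J_k^{\lambda_2} - (nS_k)^{\lambda_2}]\,Y_k\big\|_2 && \text{(by definition of } f\text{)} \\
&= 2\,\|Y\|_2 \sum_{k=1}^{m} \big\|J_k^{\lambda_2} - (nS_k)^{\lambda_2}\big\|_2 && \text{(by independence)} \\
&= 2m\,\|Y\|_2\, \big\|J_1^{\lambda_2} - (nS_1)^{\lambda_2}\big\|_2 && \text{(by symmetry).}
\end{aligned}$$

We know that $\|Y\|_2 < \infty$, and by Lemma 5 to follow the last factor above is $o(n^{\sigma})$.

□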

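Lemma 5. With $\sigma > 1/2$ denoting $\mathrm{Re}\,\lambda_2$,

$$\|J_1^{\lambda_2} - (nS_1)^{\lambda_2}\|_2 = o(n^{\sigma}).$$

Proof. Given $\epsilon > 0$ we will show that the $L^2$-norm in question is bounded by a
constant times $\epsilon^{1/2} n^{\sigma}$. The lemma then follows by letting $\epsilon \downarrow 0$.
Observe that

$$\|J_1^{\lambda_2} - (nS_1)^{\lambda_2}\|_2^2 = \mathbf{E}\,|J_1^{\lambda_2} - (nS_1)^{\lambda_2}|^2 = \mathbf{E}\,\mathbf{E}\big[\,|J_1^{\lambda_2} - (nS_1)^{\lambda_2}|^2 \,\big|\, S_1\big]. \qquad (14)$$

Until further notice assume $s > 2\epsilon$, and note that the conditional expectation
$\mathbf{E}[\,|J_1^{\lambda_2} - (nS_1)^{\lambda_2}|^2 \mid S_1 = s]$ equals

$$\sum_{j=0}^{n} \mathbf{P}[J_1 = j \mid S_1 = s]\ |j^{\lambda_2} - (ns)^{\lambda_2}|^2 = \sum_{0 \le j \le n(s-\epsilon)} + \sum_{n(s-\epsilon) < j < n(s+\epsilon)} + \sum_{n(s+\epsilon) \le j \le n}.$$

The conditional distribution of $J_1$ given $S_1 = s$ is Binomial$(n', s)$. The last sum
on the right is o(1) uniformly in s since, by [7, Ex. 1.2.10-21],

$$\mathbf{P}[J_1 \ge n(s+\epsilon) \mid S_1 = s] \le \mathbf{P}[J_1 \ge n'(s+\epsilon) \mid S_1 = s] \le \exp(-\epsilon^2 n'/2).$$

For the first sum observe that, for n large enough (independently of s),

$$\mathbf{P}[J_1 \le n(s-\epsilon) \mid S_1 = s] \le \mathbf{P}[J_1 \le n'(s - \epsilon/2) \mid S_1 = s] \le \exp(-\epsilon^2 n'/8),$$

the last inequality being a consequence of the aforementioned exercise. Thus the
first sum is also o(1) uniformly in s.

On the other hand, for the range of summation in the middle sum, by the mean
value theorem and the assumed inequality $\epsilon < s/2$ we have

$$\Big|\Big(\frac{j}{n}\Big)^{\lambda_2} - s^{\lambda_2}\Big| \le \epsilon\,|\lambda_2| \max_{\zeta \in (s-\epsilon,\, s+\epsilon)} |\zeta|^{\sigma-1} \le \epsilon\,|\lambda_2|\, c_\sigma\, s^{\sigma-1},$$

where $c_\sigma$ is $(3/2)^{\sigma-1}$ if $\sigma \ge 1$ and $(1/2)^{\sigma-1}$ if $\sigma < 1$. Thus

$$|j^{\lambda_2} - (ns)^{\lambda_2}|^2 = n^{2\sigma}\, \Big|\Big(\frac{j}{n}\Big)^{\lambda_2} - s^{\lambda_2}\Big|^2 \le \epsilon^2 |\lambda_2|^2 c_\sigma^2\, s^{2(\sigma-1)}\, n^{2\sigma}.$$

Hence the middle sum is at most $\epsilon^2 |\lambda_2|^2 c_\sigma^2\, s^{2(\sigma-1)}\, n^{2\sigma}$.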
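Note that $S_1$ has distribution Beta$(1, m - 1)$ and that

$$\int_0^1 s^{2(\sigma-1)}\, \mathbf{P}[S_1 \in ds] = \frac{\Gamma(m)\,\Gamma(2\sigma - 1)}{\Gamma(m + 2\sigma - 2)} < \infty$$

since $\sigma > 1/2$. So

$$\int_{2\epsilon}^{1} \mathbf{E}\big[\,|J_1^{\lambda_2} - (nS_1)^{\lambda_2}|^2 \,\big|\, S_1 = s\big]\ \mathbf{P}[S_1 \in ds] \le \text{constant} \times \epsilon^2 n^{2\sigma}.$$

Finally,

$$\int_0^{2\epsilon} \mathbf{E}\big[\,|J_1^{\lambda_2} - (nS_1)^{\lambda_2}|^2 \,\big|\, S_1 = s\big]\ \mathbf{P}[S_1 \in ds] \le \text{constant} \times n^{2\sigma}\, \mathbf{P}[S_1 \le 2\epsilon] \le \text{constant} \times \epsilon\, n^{2\sigma}.$$

□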
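The statement of Lemma 5 can be probed by simulation. The sketch below estimates the scaled norm $\|J_1^{\lambda_2} - (nS_1)^{\lambda_2}\|_2 / n^{\sigma}$ by Monte Carlo, drawing $S_1$ as the smallest of $m - 1$ uniforms (a Beta(1, m - 1) variable) and $J_1$ from the conditional Binomial$(n', s)$ law used in the proof. The function names are ours, and the values m = 4 and `lam = 0.75 + 1j` are illustrative placeholders chosen to keep the run short, not roots of (2).

```python
import math
import random


def complex_power(x, lam):
    """x ** lam for real x >= 0 and Re(lam) > 0, with 0 ** lam taken as 0."""
    return 0j if x == 0 else x ** lam


def scaled_l2_gap(n, m, lam, trials, rng):
    """Monte Carlo estimate of ||J_1^lam - (n S_1)^lam||_2 / n^Re(lam)."""
    n_rest = n - (m - 1)
    total = 0.0
    for _ in range(trials):
        s = rng.betavariate(1, m - 1)                      # S_1: smallest of m - 1 uniforms
        j = sum(rng.random() < s for _ in range(n_rest))   # J_1 | S_1 = s ~ Binomial(n', s)
        total += abs(complex_power(j, lam) - complex_power(n * s, lam)) ** 2
    return math.sqrt(total / trials) / n ** lam.real


rng = random.Random(12345)
for n in (50, 2000):
    print(n, scaled_l2_gap(n, 4, 0.75 + 1j, 400, rng))
```

The printed ratio should shrink as n grows, in line with the $o(n^{\sigma})$ claim; a simulation of this kind illustrates the lemma but of course proves nothing.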

Combining (8) and (13), we get

$$a_n^2 \le \mathbf{E} \sum_{k=1}^{m} (a_{J_k} + b_{J_k})^2 + r_n = \mathbf{E} \sum_{k=1}^{m} a_{J_k}^2 + 2\,\mathbf{E} \sum_{k=1}^{m} a_{J_k} b_{J_k} + \mathbf{E} \sum_{k=1}^{m} b_{J_k}^2 + r_n. \qquad (15)$$

Next we bound the terms on the right-hand side, so that (15) will yield a recursive
inequality.

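Lemma 6.

$$\mathbf{E} \sum_{k=1}^{m} b_{J_k}^2 = o(n^{2\sigma}).$$

Proof. By linearity of expectation and symmetry,

$$\mathbf{E} \sum_{k=1}^{m} b_{J_k}^2 = \sum_{k=1}^{m} \mathbf{E}\, b_{J_k}^2 = m\, \mathbf{E}\, b_{J_1}^2.$$

Now, the conditional distribution of $J_1$ given $S_1 = s$ is Binomial$(n', s)$. We show
that the conditional expectation $\mathbf{E}[b_{J_1}^2 \mid S_1 = s]$ is $o(n^{2\sigma})$. To that end, let X be
distributed Binomial$(n, s)$. For $\epsilon > 0$,

$$\mathbf{E}\, b_X^2 = \sum_{j=0}^{n} \mathbf{P}[X = j]\ b_j^2 = \sum_{0 \le j \le n(s-\epsilon)} + \sum_{n(s-\epsilon) < j \le n}.$$

Now an argument similar to the one used in the proof of Lemma 5 can be employed.
The first sum on the right is $o(n^{2\sigma})$. On the other hand, we use the fact that
$b_n = o(n^{\sigma})$ from Lemma 4 to conclude that the second sum is $o(n^{2\sigma})$. □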
Lemma 7.

$$\mathbf{E} \sum_{k=1}^{m} a_{J_k} b_{J_k} = o(n^{2\sigma}).$$

Proof. The proof (using the crude bound on $a_n$ established in Lemma 3) is very
similar to that of Lemma 6. We omit the details. □

We now complete the proof of Theorem 1. Using (15) and Lemmas 7 and 6 we
find

$$a_n^2 \le \mathbf{E} \sum_{k=1}^{m} a_{J_k}^2 + g_n = \frac{1}{\binom{n}{m-1}} \sum_{\mathbf{j}} \sum_{k=1}^{m} a_{j_k}^2 + g_n = \frac{m}{\binom{n}{m-1}} \sum_{\mathbf{j}} a_{j_1}^2 + g_n = \frac{m}{\binom{n}{m-1}} \sum_{j=0}^{n-(m-1)} \binom{n-1-j}{m-2}\, a_j^2 + g_n,$$

where $g_n = o(n^{2\sigma})$. It follows from Proposition 2 that $a_n^2 = o(n^{2\sigma})$, so that
$d_n \le a_n + b_n = o(n^{\sigma})$, as desired.

Remark. The o-estimates in Lemmas 4-7 can be improved to O-estimates. In the 
proof of Lemma 5, choosing e as a function of n (specifically, taking e n to be a suit- 
able constant multiple of n~ x / 2 logn) sharpens the estimate o(n CT ) to 0(n a ~ * yTogn), 
so that 6„ = 0{n a ~ * y/\ogn) in Lemma 4. In turn, Lemmas 6 and 7 are then im- 
mediately strengthened to 0(n 2<T_5 Inn) and 0(n 2<J ~ 3 \/logn), respectively. This 
leads to d 2 (V n ,V n ) = 0{n KeXi ) + 0(n a s (logn)i). Numerics strongly support 
the conjecture that a — RCA4 j as to f 00. If this is true, then d 2 (V ni V n ) is 
$O(n^{\mathrm{Re}\,\lambda_4})$ whenever $m \ge 1044$. Due to the presence of $r_n = O(n^{2\,\mathrm{Re}\,\lambda_4})$ in (13), this
large-m rate of convergence cannot be improved by the methods of this paper and
presumably is the exact rate. □

Finally, to prove equality in distribution of $\Lambda$ and Y, we show that $d_2(\Lambda, Y) = 0$.
Indeed with $\Lambda = |\Lambda| e^{i\Theta}$ and $Y = |Y| e^{iT}$, we have

$$d_2\big(\mathrm{Re}(n^{\lambda_2}\Lambda),\ \mathrm{Re}(n^{\lambda_2}Y)\big) = d_2\big(\mathrm{Re}(n^{\sigma+i\tau}|\Lambda|e^{i\Theta}),\ \mathrm{Re}(n^{\sigma+i\tau}|Y|e^{iT})\big) = d_2\big(n^{\sigma}|\Lambda| \cos(\tau \ln n + \Theta),\ n^{\sigma}|Y| \cos(\tau \ln n + T)\big).$$

But $d_2(\mathrm{Re}(n^{\lambda_2}\Lambda), \mathrm{Re}(n^{\lambda_2}Y)) = o(n^{\sigma})$ so that, as $n \to \infty$,

$$d_2\big(|\Lambda| \cos(\tau \ln n + \Theta),\ |Y| \cos(\tau \ln n + T)\big) \to 0.$$

For any fixed $\phi \in [0, 2\pi)$ we can choose $n \to \infty$ such that $(\tau \ln n) \bmod 2\pi \to \phi$. Then
$|\Lambda| \cos(\phi + \Theta)$ and $|Y| \cos(\phi + T)$ have the same distribution. It follows from the
Cramér-Wold device [1, Theorem 29.4] that the random vectors $(|\Lambda| \cos \Theta, |\Lambda| \sin \Theta)$
and $(|Y| \cos T, |Y| \sin T)$ have the same distribution. In particular, $\Lambda = |\Lambda| e^{i\Theta}$ and
$Y = |Y| e^{iT}$ have the same distribution, as claimed. This completes the proof of
Theorem 1.
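The step "we can choose $n \to \infty$ such that $(\tau \ln n) \bmod 2\pi \to \phi$" can be made concrete: rounding $\exp((\phi + 2\pi k)/\tau)$ to the nearest integer gives such a sequence. The sketch below is one possible construction, not necessarily the one the authors had in mind; the function names and the numerical values of `tau` and `phi` in the demo are ours.

```python
import math


def phase_subsequence(tau, phi, count):
    """Integers n_k with (tau * ln n_k) mod 2*pi approaching phi as k grows."""
    # Rounding exp((phi + 2*pi*k) / tau) to an integer perturbs tau * ln n
    # by roughly tau / (2 * n_k) at most, which tends to 0.
    return [round(math.exp((phi + 2 * math.pi * k) / tau)) for k in range(1, count + 1)]


def phase_error(n, tau, phi):
    """Signed distance on the circle between (tau * ln n) mod 2*pi and phi."""
    return (tau * math.log(n) - phi + math.pi) % (2 * math.pi) - math.pi


tau, phi = 2.2, 1.0   # hypothetical values, for illustration only
for n in phase_subsequence(tau, phi, 6):
    print(n, phase_error(n, tau, phi))
```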